Finding Information in Audio: a New Paradigm for Audio Browsing and Retrieval
نویسندگان
چکیده
Information retrieval from audio data is sharply different from information retrieval from text, not simply because speech recognition errors affect retrieval effectiveness, but more fundamentally because of the linear nature of speech, and of the differences in human capabilities for processing speech versus text. We describe SCAN, a prototype speech retrieval and browsing system that addresses these challenges of speech retrieval in an integrated way. On the retrieval side, we use novel document expansion techniques to improve retrieval from automatic transcription to a level competitive with retrieval from human transcription. Given these retrieval results, our graphical user interface, based on the novel WYSIAWYH (“What you see is almost what you hear”) paradigm, infers text formatting such as paragraph boundaries and highlighted words from acoustic information and information retrieval term scores to help users navigate the errorful automatic transcription. This interface supports information extraction and relevance ranking demonstrably better than simple speech-alone interfaces, according to results of empirical studies.
منابع مشابه
Beyond the Query-By-Example Paradigm: New Query Interfaces for Music Information Retrieval
The majority of existing work in music information retrieval for audio signals has followed the content-based query-by-example paradigm. In this paradigm a musical piece is used as a query and the result is a list of other musical pieces ranked by their content similarity. In this paper we describe algorithms and graphical user interfaces that enable novel alternative ways for querying and brow...
متن کاملEnhancing Sonic Browsing Using Audio Information Retrieval
Collections of sound and music of increasing size and diversity are used both by typical computer users and multimedia designers. Browsing audio collections poses several challenges to the design of effective user interfaces. Recent techniques in audio information retrieval allow the automatic extraction of audio content information. This information can be used to inform and enhance audio brow...
متن کامل"I Just Played That A Minute Ago!" - Designing User Interfaces For Audio Navigation
The current popularity of multimodal information retrieval research critically assumes that consumers will be found for the multimodal information thus retrieved and that interfaces can be designed that will allow users to search and browse multimodal information effectively. While there has been considerable effort given to developing the basic technologies needed for information retrieval fro...
متن کاملPrototyping a Vibrato-Aware Query-By-Humming (QBH) Music Information Retrieval System for Mobile Communication Devices: Case of Chromatic Harmonica
Background and Aim: The current research aims at prototyping query-by-humming music information retrieval systems for smart phones. Methods: This multi-method research follows simulation technique from mixed models of the operations research methodology, and the documentary research method, simultaneously. Two chromatic harmonica albums comprised the research population. To achieve the purpose ...
متن کاملSpoken Content-Based Audio Navigation (SCAN)
We describe SCAN, a system for retrieving and browsing speech documents from large audio corpora that uses new information retrieval and speech processing techniques to create easily navigable presentations of documents relevant to a user query. Experiments show that the new interface is more effective than simple speechalone interfaces.
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1999